Prosper is an American peer-to-peer lending platform. The site allows borrowers to post a listing for a chosen loan amount and purpose. Investors are then given the opportunity to invest in loans of their choice. Prosper collects data on borrower details and provides risk ratings for investors.
The dataset comes from the P2P lending platform prosper.com. It contains 113,937 observations, each describing a loan through 81 variables.
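The structure and summary listings that follow are the kind of output R's `str()` and `summary()` print for a data frame. As a minimal sketch, the snippet below builds a tiny in-memory stand-in instead of reading the real file. The name `loans` is an assumption, and the three rows reuse the first values shown in the structure listing for four of the 81 variables.

```r
# Tiny stand-in for the loan data: four of the 81 variables, with the
# first three values shown in the structure listing. The object name
# `loans` is an assumption made for this sketch.
csv_text <- "Term,LoanStatus,BorrowerAPR,LoanOriginalAmount
36,Completed,0.165,9425
36,Current,0.12,10000
36,Completed,0.283,3001"

# stringsAsFactors = TRUE yields Factor columns like those in the listing
loans <- read.csv(text = csv_text, stringsAsFactors = TRUE)

str(loans)      # one line per variable: type and leading values
summary(loans)  # level counts for factors, quartiles and mean for numbers
```

With the real data, the `text =` argument would be replaced by the path to the downloaded CSV file.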
## 'data.frame': 113937 obs. of 81 variables:
## $ ListingKey : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
## $ ListingNumber : int 193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
## $ ListingCreationDate : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...
## $ CreditGrade : Factor w/ 8 levels "A","AA","B","C",..: 4 NA 7 NA NA NA NA NA NA NA ...
## $ Term : int 36 36 36 36 36 60 36 36 36 36 ...
## $ LoanStatus : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
## $ ClosedDate : Factor w/ 2802 levels "2005-11-25 00:00:00",..: 1137 NA 1262 NA NA NA NA NA NA NA ...
## $ BorrowerAPR : num 0.165 0.12 0.283 0.125 0.246 ...
## $ BorrowerRate : num 0.158 0.092 0.275 0.0974 0.2085 ...
## $ LenderYield : num 0.138 0.082 0.24 0.0874 0.1985 ...
## $ EstimatedEffectiveYield : num NA 0.0796 NA 0.0849 0.1832 ...
## $ EstimatedLoss : num NA 0.0249 NA 0.0249 0.0925 ...
## $ EstimatedReturn : num NA 0.0547 NA 0.06 0.0907 ...
## $ ProsperRating..numeric. : int NA 6 NA 6 3 5 2 4 7 7 ...
## $ ProsperRating..Alpha. : Factor w/ 7 levels "A","AA","B","C",..: NA 1 NA 1 5 3 6 4 2 2 ...
## $ ProsperScore : num NA 7 NA 9 4 10 2 4 9 11 ...
## $ ListingCategory..numeric. : int 0 2 0 16 2 1 1 2 7 7 ...
## $ BorrowerState : Factor w/ 51 levels "AK","AL","AR",..: 6 6 11 11 24 33 17 5 15 15 ...
## $ Occupation : Factor w/ 67 levels "Accountant/CPA",..: 36 42 36 51 20 42 49 28 23 23 ...
## $ EmploymentStatus : Factor w/ 8 levels "Employed","Full-time",..: 8 1 3 1 1 1 1 1 1 1 ...
## $ EmploymentStatusDuration : int 2 44 NA 113 44 82 172 103 269 269 ...
## $ IsBorrowerHomeowner : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
## $ CurrentlyInGroup : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
## $ GroupKey : Factor w/ 706 levels "00343376901312423168731",..: NA NA 334 NA NA NA NA NA NA NA ...
## $ DateCreditPulled : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...
## $ CreditScoreRangeLower : int 640 680 480 800 680 740 680 700 820 820 ...
## $ CreditScoreRangeUpper : int 659 699 499 819 699 759 699 719 839 839 ...
## $ FirstRecordedCreditLine : Factor w/ 11585 levels "1947-08-24 00:00:00",..: 8638 6616 8926 2246 9497 496 8264 7684 5542 5542 ...
## $ CurrentCreditLines : int 5 14 NA 5 19 21 10 6 17 17 ...
## $ OpenCreditLines : int 4 14 NA 5 19 17 7 6 16 16 ...
## $ TotalCreditLinespast7years : int 12 29 3 29 49 49 20 10 32 32 ...
## $ OpenRevolvingAccounts : int 1 13 0 7 6 13 6 5 12 12 ...
## $ OpenRevolvingMonthlyPayment : num 24 389 0 115 220 1410 214 101 219 219 ...
## $ InquiriesLast6Months : int 3 3 0 0 1 0 0 3 1 1 ...
## $ TotalInquiries : num 3 5 1 1 9 2 0 16 6 6 ...
## $ CurrentDelinquencies : int 2 0 1 4 0 0 0 0 0 0 ...
## $ AmountDelinquent : num 472 0 NA 10056 0 ...
## $ DelinquenciesLast7Years : int 4 0 0 14 0 0 0 0 0 0 ...
## $ PublicRecordsLast10Years : int 0 1 0 0 0 0 0 1 0 0 ...
## $ PublicRecordsLast12Months : int 0 0 NA 0 0 0 0 0 0 0 ...
## $ RevolvingCreditBalance : num 0 3989 NA 1444 6193 ...
## $ BankcardUtilization : num 0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
## $ AvailableBankcardCredit : num 1500 10266 NA 30754 695 ...
## $ TotalTrades : num 11 29 NA 26 39 47 16 10 29 29 ...
## $ TradesNeverDelinquent..percentage. : num 0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
## $ TradesOpenedLast6Months : num 0 2 NA 0 2 0 0 0 1 1 ...
## $ DebtToIncomeRatio : num 0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
## $ IncomeRange : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
## $ IncomeVerifiable : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
## $ StatedMonthlyIncome : num 3083 6125 2083 2875 9583 ...
## $ LoanKey : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
## $ TotalProsperLoans : int NA NA NA NA 1 NA NA NA NA NA ...
## $ TotalProsperPaymentsBilled : int NA NA NA NA 11 NA NA NA NA NA ...
## $ OnTimeProsperPayments : int NA NA NA NA 11 NA NA NA NA NA ...
## $ ProsperPaymentsLessThanOneMonthLate: int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPaymentsOneMonthPlusLate : int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPrincipalBorrowed : num NA NA NA NA 11000 NA NA NA NA NA ...
## $ ProsperPrincipalOutstanding : num NA NA NA NA 9948 ...
## $ ScorexChangeAtTimeOfListing : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanCurrentDaysDelinquent : int 0 0 0 0 0 0 0 0 0 0 ...
## $ LoanFirstDefaultedCycleNumber : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanMonthsSinceOrigination : int 78 0 86 16 6 3 11 10 3 3 ...
## $ LoanNumber : int 19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
## $ LoanOriginalAmount : int 9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
## $ LoanOriginationDate : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
## $ LoanOriginationQuarter : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
## $ MemberKey : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
## $ MonthlyLoanPayment : num 330 319 123 321 564 ...
## $ LP_CustomerPayments : num 11396 0 4187 5143 2820 ...
## $ LP_CustomerPrincipalPayments : num 9425 0 3001 4091 1563 ...
## $ LP_InterestandFees : num 1971 0 1186 1052 1257 ...
## $ LP_ServiceFees : num -133.2 0 -24.2 -108 -60.3 ...
## $ LP_CollectionFees : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_GrossPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NetPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NonPrincipalRecoverypayments : num 0 0 0 0 0 0 0 0 0 0 ...
## $ PercentFunded : num 1 1 1 1 1 1 1 1 1 1 ...
## $ Recommendations : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsCount : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsAmount : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Investors : int 258 1 41 158 20 1 1 1 1 1 ...
## ListingKey ListingNumber
## 17A93590655669644DB4C06: 6 Min. : 4
## 349D3587495831350F0F648: 4 1st Qu.: 400919
## 47C1359638497431975670B: 4 Median : 600554
## 8474358854651984137201C: 4 Mean : 627886
## DE8535960513435199406CE: 4 3rd Qu.: 892634
## 04C13599434217079754AEE: 3 Max. :1255725
## (Other) :113912
## ListingCreationDate CreditGrade Term
## 2013-10-02 17:20:16.550000000: 6 C : 5649 Min. :12.00
## 2013-08-28 20:31:41.107000000: 4 D : 5153 1st Qu.:36.00
## 2013-09-08 09:27:44.853000000: 4 B : 4389 Median :36.00
## 2013-12-06 05:43:13.830000000: 4 AA : 3509 Mean :40.83
## 2013-12-06 11:44:58.283000000: 4 HR : 3508 3rd Qu.:36.00
## 2013-08-21 07:25:22.360000000: 3 (Other): 6745 Max. :60.00
## (Other) :113912 NA's :84984
## LoanStatus ClosedDate
## Current :56576 2014-03-04 00:00:00: 105
## Completed :38074 2014-02-19 00:00:00: 100
## Chargedoff :11992 2014-02-11 00:00:00: 92
## Defaulted : 5018 2012-10-30 00:00:00: 81
## Past Due (1-15 days) : 806 2013-02-26 00:00:00: 78
## Past Due (31-60 days): 363 (Other) :54633
## (Other) : 1108 NA's :58848
## BorrowerAPR BorrowerRate LenderYield
## Min. :0.00653 Min. :0.0000 Min. :-0.0100
## 1st Qu.:0.15629 1st Qu.:0.1340 1st Qu.: 0.1242
## Median :0.20976 Median :0.1840 Median : 0.1730
## Mean :0.21883 Mean :0.1928 Mean : 0.1827
## 3rd Qu.:0.28381 3rd Qu.:0.2500 3rd Qu.: 0.2400
## Max. :0.51229 Max. :0.4975 Max. : 0.4925
## NA's :25
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.183 Min. :0.005 Min. :-0.183
## 1st Qu.: 0.116 1st Qu.:0.042 1st Qu.: 0.074
## Median : 0.162 Median :0.072 Median : 0.092
## Mean : 0.169 Mean :0.080 Mean : 0.096
## 3rd Qu.: 0.224 3rd Qu.:0.112 3rd Qu.: 0.117
## Max. : 0.320 Max. :0.366 Max. : 0.284
## NA's :29084 NA's :29084 NA's :29084
## ProsperRating..numeric. ProsperRating..Alpha. ProsperScore
## Min. :1.000 C :18345 Min. : 1.00
## 1st Qu.:3.000 B :15581 1st Qu.: 4.00
## Median :4.000 A :14551 Median : 6.00
## Mean :4.072 D :14274 Mean : 5.95
## 3rd Qu.:5.000 E : 9795 3rd Qu.: 8.00
## Max. :7.000 (Other):12307 Max. :11.00
## NA's :29084 NA's :29084 NA's :29084
## ListingCategory..numeric. BorrowerState Occupation
## Min. : 0.000 CA :14717 Other :28617
## 1st Qu.: 1.000 TX : 6842 Professional :13628
## Median : 1.000 NY : 6729 Computer Programmer: 4478
## Mean : 2.774 FL : 6720 Executive : 4311
## 3rd Qu.: 3.000 IL : 5921 Teacher : 3759
## Max. :20.000 (Other):67493 (Other) :55556
## NA's : 5515 NA's : 3588
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## Employed :67322 Min. : 0.00 False:56459
## Full-time :26355 1st Qu.: 26.00 True :57478
## Self-employed: 6134 Median : 67.00
## Not available: 5347 Mean : 96.07
## Other : 3806 3rd Qu.:137.00
## (Other) : 2718 Max. :755.00
## NA's : 2255 NA's :7625
## CurrentlyInGroup GroupKey
## False:101218 783C3371218786870A73D20: 1140
## True : 12719 3D4D3366260257624AB272D: 916
## 6A3B336601725506917317E: 698
## FEF83377364176536637E50: 611
## C9643379247860156A00EC0: 342
## (Other) : 9634
## NA's :100596
## DateCreditPulled CreditScoreRangeLower CreditScoreRangeUpper
## 2013-12-23 09:38:12: 6 Min. : 0.0 Min. : 19.0
## 2013-11-21 09:09:41: 4 1st Qu.:660.0 1st Qu.:679.0
## 2013-12-06 05:43:16: 4 Median :680.0 Median :699.0
## 2014-01-14 20:17:49: 4 Mean :685.6 Mean :704.6
## 2014-02-09 12:14:41: 4 3rd Qu.:720.0 3rd Qu.:739.0
## 2013-09-27 22:04:54: 3 Max. :880.0 Max. :899.0
## (Other) :113912 NA's :591 NA's :591
## FirstRecordedCreditLine CurrentCreditLines OpenCreditLines
## 1993-12-01 00:00:00: 185 Min. : 0.00 Min. : 0.00
## 1994-11-01 00:00:00: 178 1st Qu.: 7.00 1st Qu.: 6.00
## 1995-11-01 00:00:00: 168 Median :10.00 Median : 9.00
## 1990-04-01 00:00:00: 161 Mean :10.32 Mean : 9.26
## 1995-03-01 00:00:00: 159 3rd Qu.:13.00 3rd Qu.:12.00
## (Other) :112389 Max. :59.00 Max. :54.00
## NA's : 697 NA's :7604 NA's :7604
## TotalCreditLinespast7years OpenRevolvingAccounts
## Min. : 2.00 Min. : 0.00
## 1st Qu.: 17.00 1st Qu.: 4.00
## Median : 25.00 Median : 6.00
## Mean : 26.75 Mean : 6.97
## 3rd Qu.: 35.00 3rd Qu.: 9.00
## Max. :136.00 Max. :51.00
## NA's :697
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## Min. : 0.0 Min. : 0.000 Min. : 0.000
## 1st Qu.: 114.0 1st Qu.: 0.000 1st Qu.: 2.000
## Median : 271.0 Median : 1.000 Median : 4.000
## Mean : 398.3 Mean : 1.435 Mean : 5.584
## 3rd Qu.: 525.0 3rd Qu.: 2.000 3rd Qu.: 7.000
## Max. :14985.0 Max. :105.000 Max. :379.000
## NA's :697 NA's :1159
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## Min. : 0.0000 Min. : 0.0 Min. : 0.000
## 1st Qu.: 0.0000 1st Qu.: 0.0 1st Qu.: 0.000
## Median : 0.0000 Median : 0.0 Median : 0.000
## Mean : 0.5921 Mean : 984.5 Mean : 4.155
## 3rd Qu.: 0.0000 3rd Qu.: 0.0 3rd Qu.: 3.000
## Max. :83.0000 Max. :463881.0 Max. :99.000
## NA's :697 NA's :7622 NA's :990
## PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
## Min. : 0.0000 Min. : 0.000 Min. : 0
## 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 3121
## Median : 0.0000 Median : 0.000 Median : 8549
## Mean : 0.3126 Mean : 0.015 Mean : 17599
## 3rd Qu.: 0.0000 3rd Qu.: 0.000 3rd Qu.: 19521
## Max. :38.0000 Max. :20.000 Max. :1435667
## NA's :697 NA's :7604 NA's :7604
## BankcardUtilization AvailableBankcardCredit TotalTrades
## Min. :0.000 Min. : 0 Min. : 0.00
## 1st Qu.:0.310 1st Qu.: 880 1st Qu.: 15.00
## Median :0.600 Median : 4100 Median : 22.00
## Mean :0.561 Mean : 11210 Mean : 23.23
## 3rd Qu.:0.840 3rd Qu.: 13180 3rd Qu.: 30.00
## Max. :5.950 Max. :646285 Max. :126.00
## NA's :7604 NA's :7544 NA's :7544
## TradesNeverDelinquent..percentage. TradesOpenedLast6Months
## Min. :0.000 Min. : 0.000
## 1st Qu.:0.820 1st Qu.: 0.000
## Median :0.940 Median : 0.000
## Mean :0.886 Mean : 0.802
## 3rd Qu.:1.000 3rd Qu.: 1.000
## Max. :1.000 Max. :20.000
## NA's :7544 NA's :7544
## DebtToIncomeRatio IncomeRange IncomeVerifiable
## Min. : 0.000 $25,000-49,999:32192 False: 8669
## 1st Qu.: 0.140 $50,000-74,999:31050 True :105268
## Median : 0.220 $100,000+ :17337
## Mean : 0.276 $75,000-99,999:16916
## 3rd Qu.: 0.320 Not displayed : 7741
## Max. :10.010 $1-24,999 : 7274
## NA's :8554 (Other) : 1427
## StatedMonthlyIncome LoanKey TotalProsperLoans
## Min. : 0 CB1B37030986463208432A1: 6 Min. :0.00
## 1st Qu.: 3200 2DEE3698211017519D7333F: 4 1st Qu.:1.00
## Median : 4667 9F4B37043517554537C364C: 4 Median :1.00
## Mean : 5608 D895370150591392337ED6D: 4 Mean :1.42
## 3rd Qu.: 6825 E6FB37073953690388BC56D: 4 3rd Qu.:2.00
## Max. :1750003 0D8F37036734373301ED419: 3 Max. :8.00
## (Other) :113912 NA's :91852
## TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 9.00 1st Qu.: 9.00
## Median : 16.00 Median : 15.00
## Mean : 22.93 Mean : 22.27
## 3rd Qu.: 33.00 3rd Qu.: 32.00
## Max. :141.00 Max. :141.00
## NA's :91852 NA's :91852
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00
## Median : 0.00 Median : 0.00
## Mean : 0.61 Mean : 0.05
## 3rd Qu.: 0.00 3rd Qu.: 0.00
## Max. :42.00 Max. :21.00
## NA's :91852 NA's :91852
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 0 Min. : 0
## 1st Qu.: 3500 1st Qu.: 0
## Median : 6000 Median : 1627
## Mean : 8472 Mean : 2930
## 3rd Qu.:11000 3rd Qu.: 4127
## Max. :72499 Max. :23451
## NA's :91852 NA's :91852
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-209.00 Min. : 0.0
## 1st Qu.: -35.00 1st Qu.: 0.0
## Median : -3.00 Median : 0.0
## Mean : -3.22 Mean : 152.8
## 3rd Qu.: 25.00 3rd Qu.: 0.0
## Max. : 286.00 Max. :2704.0
## NA's :95009
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.00 Min. : 0.0 Min. : 1
## 1st Qu.: 9.00 1st Qu.: 6.0 1st Qu.: 37332
## Median :14.00 Median : 21.0 Median : 68599
## Mean :16.27 Mean : 31.9 Mean : 69444
## 3rd Qu.:22.00 3rd Qu.: 65.0 3rd Qu.:101901
## Max. :44.00 Max. :100.0 Max. :136486
## NA's :96985
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 2014-01-22 00:00:00: 491 Q4 2013:14450
## 1st Qu.: 4000 2013-11-13 00:00:00: 490 Q1 2014:12172
## Median : 6500 2014-02-19 00:00:00: 439 Q3 2013: 9180
## Mean : 8337 2013-10-16 00:00:00: 434 Q2 2013: 7099
## 3rd Qu.:12000 2014-01-28 00:00:00: 339 Q3 2012: 5632
## Max. :35000 2013-09-24 00:00:00: 316 Q2 2012: 5061
## (Other) :111428 (Other):60343
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 63CA34120866140639431C9: 9 Min. : 0.0 Min. : -2.35
## 16083364744933457E57FB9: 8 1st Qu.: 131.6 1st Qu.: 1005.76
## 3A2F3380477699707C81385: 8 Median : 217.7 Median : 2583.83
## 4D9C3403302047712AD0CDD: 8 Mean : 272.5 Mean : 4183.08
## 739C338135235294782AE75: 8 3rd Qu.: 371.6 3rd Qu.: 5548.40
## 7E1733653050264822FAA3D: 8 Max. :2251.5 Max. :40702.39
## (Other) :113888
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : -2.35 Min. :-664.87
## 1st Qu.: 500.9 1st Qu.: 274.87 1st Qu.: -73.18
## Median : 1587.5 Median : 700.84 Median : -34.44
## Mean : 3105.5 Mean : 1077.54 Mean : -54.73
## 3rd Qu.: 4000.0 3rd Qu.: 1458.54 3rd Qu.: -13.92
## Max. :35000.0 Max. :15617.03 Max. : 32.06
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9274.75 Min. : -94.2 Min. : -954.5
## 1st Qu.: 0.00 1st Qu.: 0.0 1st Qu.: 0.0
## Median : 0.00 Median : 0.0 Median : 0.0
## Mean : -14.24 Mean : 700.4 Mean : 681.4
## 3rd Qu.: 0.00 3rd Qu.: 0.0 3rd Qu.: 0.0
## Max. : 0.00 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.00 Min. :0.7000 Min. : 0.00000
## 1st Qu.: 0.00 1st Qu.:1.0000 1st Qu.: 0.00000
## Median : 0.00 Median :1.0000 Median : 0.00000
## Mean : 25.14 Mean :0.9986 Mean : 0.04803
## 3rd Qu.: 0.00 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :21117.90 Max. :1.0125 Max. :39.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. : 0.00000 Min. : 0.00 Min. : 1.00
## 1st Qu.: 0.00000 1st Qu.: 0.00 1st Qu.: 2.00
## Median : 0.00000 Median : 0.00 Median : 44.00
## Mean : 0.02346 Mean : 16.55 Mean : 80.48
## 3rd Qu.: 0.00000 3rd Qu.: 0.00 3rd Qu.: 115.00
## Max. :33.00000 Max. :25000.00 Max. :1189.00
##
Based on the plot, the number of listed loans decreased sharply from 2008 to 2009. This may be due to the global economic downturn, during which people had little spare money to invest and lost confidence in the market. After 2009, as P2P lending grew more popular and the economy stabilized, people regained confidence in the market and became willing to lend money to others, so the number of listed loans increased year by year. Prosper's business has developed steadily in recent years.
Based on the plot, loans with a lender yield between 0.125 and 0.175 were the most common. As the lender yield rises above 0.175, the number of listed loans decreases.
Loans taken out for debt consolidation far outnumber those for any other purpose. This suggests that Prosper is not the first choice for most people seeking a loan: most borrowers use Prosper to pay off other loans.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 4000 6500 8337 12000 35000
Based on the plot, loans of 1,000 to 5,000 dollars are far more common than loans of other sizes. The maximum of 35,000 dollars may be the upper limit that Prosper allows for a loan.
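The comparison of loan sizes can also be made numerically by binning the amounts. The sketch below uses made-up amounts (not the real data) and an assumed bin width of 5,000 dollars. Only the upper edge of 35,000 comes from the summary above.

```r
# Hypothetical loan amounts in dollars (illustrative, not the real data)
amount <- c(1000, 2500, 3000, 4000, 4000, 6500, 10000, 12000, 15000, 35000)

# Group the amounts into 5,000-dollar bins up to the reported maximum;
# dig.lab = 5 keeps the bin labels out of scientific notation
bins <- cut(amount, breaks = seq(0, 35000, by = 5000), dig.lab = 5)
counts <- table(bins)
counts

# Label of the most populated bin
names(counts)[which.max(counts)]
```

`cut()` closes each interval on the right by default, so an amount of exactly 5,000 falls into the first bin.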
For this question, we created a newStatus variable that divides loans into two groups, good and bad. We regarded loans with a status of “Completed” or “Current” as good loans and loans with any other status as bad loans.
## [1] "Completed" "Current"
## [3] "Past Due (1-15 days)" "Defaulted"
## [5] "Chargedoff" "Past Due (16-30 days)"
## [7] "Cancelled" "Past Due (61-90 days)"
## [9] "Past Due (31-60 days)" "Past Due (91-120 days)"
## [11] "FinalPaymentInProgress" "Past Due (>120 days)"
## [1] "Good" "Bad"
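The grouping into good and bad loans described above can be sketched in base R as follows. The vector `loan_status` is a short hypothetical stand-in for the LoanStatus column, and `new_status` is an assumed name for the derived variable.

```r
# Hypothetical stand-in for the LoanStatus column (a few of the 12 levels)
loan_status <- c("Completed", "Current", "Past Due (1-15 days)",
                 "Defaulted", "Chargedoff", "Cancelled",
                 "FinalPaymentInProgress", "Current")

# "Completed" and "Current" count as good; every other status counts as bad
new_status <- ifelse(loan_status %in% c("Completed", "Current"), "Good", "Bad")
new_status <- factor(new_status, levels = c("Good", "Bad"))

table(new_status)
```

Fixing the level order with `factor()` keeps "Good" first in tables and plots, matching the order printed above.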
Good loans outnumber bad loans by well over three to one. The status counts in the summary above (56,576 Current and 38,074 Completed out of 113,937 loans) put the ratio at close to five to one. This suggests that Prosper is managed and operated well and that the business carries low risk.
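The balance between good and bad loans can be checked with a few lines of arithmetic. This sketch uses only the counts printed in the LoanStatus summary above. The variable names are illustrative.

```r
# Counts taken from the LoanStatus summary above
total <- 113937
good  <- 56576 + 38074   # Current + Completed
bad   <- total - good    # every other status

good_share  <- good / total  # fraction of all loans that are good
good_to_bad <- good / bad    # good loans per bad loan

round(c(good = good, bad = bad, share = good_share, ratio = good_to_bad), 3)
```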
Employed borrowers far outnumbered unemployed borrowers, and full-time borrowers outnumbered part-time borrowers. This suggests that Prosper's lending criteria are strict and that most unemployed applicants cannot obtain a loan.
The longer the employment duration, the fewer the loans. This may be because people in entry-level jobs do not earn much and need to borrow to cover other expenses, or because people who start their own business early in their career need start-up capital.
Based on the plot, borrowers with an annual income between 25,000 and 75,000 dollars were the most numerous. An annual income above 25,000 dollars appears to help in obtaining a loan from Prosper, whereas being unemployed or earning less than 25,000 dollars a year makes it harder to obtain one.
Based on the plot, debt-to-income ratios between 0.175 and 0.225 are the most common. As the ratio increases further, the count decreases.
In 2011, the lender yield peaked, and after that year it kept decreasing. The reason may be that P2P lending became increasingly popular, so more and more lenders wanted to lend their money through P2P platforms. As the market became more competitive, lenders accepted lower yields to make sure their money was actually lent out. The economic downturn of 2008 also influenced the lender yield.
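A yearly trend like this one can be computed by averaging the yield within each year. The sketch below uses made-up dates and yields, and it assumes the year is taken from a timestamp string formatted like LoanOriginationDate. The text does not say which date column the plot used.

```r
# Hypothetical origination timestamps and lender yields (not the real data)
origination <- c("2010-03-01 00:00:00", "2011-06-15 00:00:00",
                 "2011-09-30 00:00:00", "2012-01-20 00:00:00",
                 "2013-11-13 00:00:00")
lender_yield <- c(0.20, 0.26, 0.24, 0.21, 0.15)

# The year is the first four characters of the timestamp string
year <- as.integer(substr(origination, 1, 4))

# Mean lender yield per year
yield_by_year <- tapply(lender_yield, year, mean)
yield_by_year

# Year with the highest mean yield
names(yield_by_year)[which.max(yield_by_year)]
```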
With the growing popularity of P2P loans and increasingly strict regulation and supervision of them, people placed more trust in this kind of loan, and more investors lent money on Prosper. This is likely why the loan amount kept increasing over these years.
Based on the plot, a new credit-score criterion for borrowers appears to have been introduced in 2009. As P2P loans became more popular, more people without a high credit score began applying for loans from Prosper, so the average credit score of borrowers decreased slightly after 2009.
The dataset does not cover the whole of 2005 or 2014, so I removed these two years when examining the loan status counts. Based on the plot, from 2009 to 2013 the count of good loans increased sharply, whereas the count of bad loans stayed almost the same. This suggests that the market environment was favorable in those years.
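Dropping the two incomplete years before counting can be sketched as below. The `year` and `status` vectors are hypothetical stand-ins for the derived year and good/bad status of each loan.

```r
# Hypothetical year and good/bad status of nine loans (not the real data)
year   <- c(2005, 2009, 2009, 2010, 2012, 2013, 2013, 2013, 2014)
status <- c("Good", "Good", "Bad", "Good", "Good", "Good", "Good", "Bad", "Good")

# Drop the two partially covered years before counting
keep <- !(year %in% c(2005, 2014))

# Cross-tabulate the remaining loans by year and status
counts <- table(year = year[keep], status = status[keep])
counts
```

Filtering first keeps the partial years from appearing as misleadingly low bars at either end of the plot.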
From the plot, the percentage of good loans is higher for employed borrowers than for unemployed borrowers. Surprisingly, the percentage of good loans is lower for full-time borrowers than for part-time borrowers. Self-employed borrowers also performed better than full-time and part-time borrowers, and retired borrowers performed worst.
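The share of good loans within each employment group is a row-wise proportion of a two-way table. A minimal sketch with made-up data follows. The group labels are illustrative, and the resulting shares do not reproduce the plot.

```r
# Hypothetical employment status and loan outcome (not the real data)
employment <- c("Employed", "Employed", "Employed", "Full-time", "Full-time",
                "Part-time", "Part-time", "Retired", "Retired", "Not employed")
status <- c("Good", "Good", "Bad", "Good", "Bad",
            "Good", "Good", "Bad", "Good", "Bad")

# margin = 1 normalizes each row, so every employment group sums to 1
share <- prop.table(table(employment, status), margin = 1)

# Fraction of good loans per employment group
round(share[, "Good"], 2)
```

Comparing proportions rather than raw counts matters here because the groups differ greatly in size.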
Based on the plot, the percentage of good loans clearly increases with income range: the higher the borrower’s income, the lower the risk of the loan.
The employment status duration of borrowers with good loans is slightly higher than that of borrowers with bad loans.
Based on the plot, the debt-to-income ratio does not appear to influence the loan status: the mean and the quantiles are almost the same for the two status groups. This may be because the loan review and approval process pays close attention to the applicant's debt-to-income ratio, so loans are rarely granted to people with a high ratio.
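The comparison of means and quantiles between the two groups can be made explicit with a per-group summary. The sketch below uses made-up ratios chosen to be similar across groups. The names `dti` and `status` are illustrative.

```r
# Hypothetical debt-to-income ratios for five good and five bad loans
dti    <- c(0.10, 0.18, 0.22, 0.25, 0.40, 0.12, 0.20, 0.23, 0.30, 0.38)
status <- rep(c("Good", "Bad"), each = 5)

# Quartiles and mean of the ratio within each status group;
# the result is a matrix with one column per group
by_status <- sapply(split(dti, status), function(x) {
  c(quantile(x, c(0.25, 0.5, 0.75)), mean = mean(x))
})
round(by_status, 3)
```

Columns that are nearly identical, as in this toy example, correspond to the overlapping boxes described above.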
Based on the plot, large loans are given only to borrowers with a debt-to-income ratio below 0.5. The higher the debt-to-income ratio, the harder it is to obtain such a loan. Loan amounts also cluster at round values such as 5,000, 10,000 and 15,000 dollars, which are popular on Prosper.
Based on the plot, full-time and part-time employed borrowers have the lowest borrower APR, whereas unemployed borrowers have the highest. The reason may be that some loans are available only to employed borrowers, while unemployed borrowers can apply for only a limited number of loans, which carry high rates.
Based on the plot, the lender yield decreases as the borrower’s Prosper rating increases.
Based on the plot, the borrower APR decreases as the loan amount increases. In addition, the higher the Prosper rating, the lower the APR.
Based on the plot, when the Prosper score is low, the debt-to-income ratio is an important indicator of the risk of a loan. When the Prosper score is high, however, the debt-to-income ratio matters much less.
Based on the plot, the borrower APR has a positive linear relationship with the debt-to-income ratio for employed borrowers, especially full-time employed borrowers. For unemployed borrowers, no clear relationship is apparent.
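One way to quantify a per-group linear relationship is to fit a separate least-squares line in each employment group and compare the slopes. The sketch below does this on made-up data constructed so that one group has a clear positive slope and the other has none. It is not the method used for the plot, and all names are illustrative.

```r
# Hypothetical data: APR rises with the ratio in one group only
employment <- rep(c("Full-time", "Not employed"), each = 4)
dti <- c(0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4)
apr <- c(0.12, 0.16, 0.20, 0.24, 0.30, 0.28, 0.31, 0.29)
loans <- data.frame(employment, dti, apr)

# Fit apr ~ dti separately in each group and keep the slope coefficient
slopes <- sapply(split(loans, loans$employment), function(g) {
  coef(lm(apr ~ dti, data = g))[["dti"]]
})
round(slopes, 3)
```

A slope near zero, as in the second group here, corresponds to the flat pattern described for unemployed borrowers.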
Based on the plot, when the employment duration is short, the borrower APR has a strong bearing on the loan risk. People with a short employment duration tend to have less stable jobs and lower incomes, so when they face a high APR they may lose the ability to repay the loan because of unemployment or low income. People with a long employment duration generally have higher salaries and more stable jobs, so the APR matters less to the risk of the loan.
In the analysis, the first difficulty I encountered was that the data were incomplete over time. I solved this by limiting the time period. The second was the complexity of the variables: some categorical variables have 8 to 10 levels. To deal with this, I created several new variables that re-categorize the values and make the analysis easier. The third was my lack of finance knowledge, which made some variables in this dataset hard to understand. I did my best to understand the variables and left out those I could not readily interpret.
For future work, several points can be explored further. First, the loan status was reduced to a simple representation in my research. Instead of just two labels (good, bad), the original, more detailed loan status variable could be used for in-depth research. In addition, more features could be considered, which may reveal more patterns and relationships between variables. Finally, predictive analytics methods could be applied.